Efficient Word Retrieval by Means of SOM Clustering and PCA
Authors
Abstract
We propose an approach for efficient word retrieval from printed documents belonging to Digital Libraries. The approach combines word image clustering (based on Self Organizing Maps, SOM) with Principal Component Analysis (PCA). Combining these methods allows us to efficiently retrieve the matching words from large document collections without directly comparing the query word with each indexed word.
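The abstract does not say how the two methods are wired together, so the following is only a minimal sketch of one plausible arrangement, not the authors' implementation. It assumes the SOM is used to pick a single cluster of candidate words and that PCA-projected vectors are then used to rank only those candidates. All function names (`train_som`, `bmu_index`, `fit_pca`, `build_index`, `query`) are hypothetical, and random synthetic vectors stand in for real word image features.

```python
import numpy as np

def train_som(data, rows, cols, epochs=20, lr0=0.5, sigma0=1.0, seed=0):
    """Train a small SOM; returns one weight vector per map unit."""
    rng = np.random.default_rng(seed)
    # Initialise the units from randomly chosen training samples.
    weights = data[rng.integers(0, len(data), rows * cols)].astype(float)
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    n_steps, step = epochs * len(data), 0
    for _ in range(epochs):
        for x in data[rng.permutation(len(data))]:
            frac = step / n_steps
            lr = lr0 * (1.0 - frac)                 # decaying learning rate
            sigma = sigma0 * (1.0 - frac) + 0.1     # shrinking neighbourhood
            bmu = np.argmin(((weights - x) ** 2).sum(axis=1))
            d2 = ((grid - grid[bmu]) ** 2).sum(axis=1)
            h = np.exp(-d2 / (2.0 * sigma ** 2))    # Gaussian neighbourhood on the lattice
            weights += lr * h[:, None] * (x - weights)
            step += 1
    return weights

def bmu_index(weights, x):
    """Index of the best-matching unit (closest SOM cluster center)."""
    return int(np.argmin(((weights - x) ** 2).sum(axis=1)))

def fit_pca(data, k):
    """Return the data mean and the top-k principal directions."""
    mean = data.mean(axis=0)
    _, _, vt = np.linalg.svd(data - mean, full_matrices=False)
    return mean, vt[:k]

def build_index(features, weights, mean, components):
    """Group word indices by SOM cluster and store PCA-projected vectors."""
    projected = (features - mean) @ components.T
    clusters = {}
    for i, x in enumerate(features):
        clusters.setdefault(bmu_index(weights, x), []).append(i)
    return clusters, projected

def query(x, weights, mean, components, clusters, projected, top=3):
    """Rank only the words in the query's SOM cluster, in PCA space."""
    candidates = clusters.get(bmu_index(weights, x), [])
    if not candidates:
        return []
    q = (x - mean) @ components.T
    dists = np.linalg.norm(projected[candidates] - q, axis=1)
    return [candidates[i] for i in np.argsort(dists)[:top]]

# Synthetic stand-in for word image features: 3 groups of 16-d vectors.
rng = np.random.default_rng(1)
centers = rng.normal(0.0, 5.0, size=(3, 16))
features = np.vstack([c + rng.normal(0.0, 0.5, size=(20, 16)) for c in centers])

weights = train_som(features, rows=2, cols=2)
mean, components = fit_pca(features, k=4)
clusters, projected = build_index(features, weights, mean, components)
results = query(features[5], weights, mean, components, clusters, projected)
```

In this sketch the query is compared against the SOM cluster centers and then against the members of one cluster only, which is one way to avoid comparing it with every indexed word. The map size, the number of principal components, and the use of a single global PCA are arbitrary choices made for illustration.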
Similar resources
SOM-based Document Image Retrieval
In this paper we discuss some applications of word image clustering (based on Self Organizing Maps, SOM) for tasks related to document image retrieval. Two main applications are discussed: document retrieval and word retrieval. In document retrieval a document representation based on the vector model is obtained by computing the occurrences of words belonging to the SOM clusters in each documen...
Efficient Data Retrieval using Combine Approach of SOM and K-Mean Clustering
The emergence of recent techniques for scientific knowledge collection has resulted in large-scale accumulation of information relating to various fields. Typical information querying methods are inadequate for extracting useful data from huge knowledge banks. Cluster analysis is one of the key knowledge analysis methods, and the k-means clustering algorithm is widely used in several data mining applications. The anal...
A Novel Method for Content Base Image Retrieval Using Combination of Local and Global Features
Content-based image retrieval (CBIR) has been an active research topic in the last decade. In this paper we propose an image retrieval method using global and local features. First, for local feature extraction, the SURF algorithm produces a set of interest points for each image and a set of 64-dimensional descriptors for each interest point, and then, to use the Bag of Visual Words model, a cluster...
SOM clustering for text retrieval and classification with examples on Indian scripts
In this paper, we discuss the use of Self Organizing Maps (SOM) for character and word clustering. The SOM is a particular kind of artificial neural network that computes an unsupervised clustering of the input data arranging the cluster centers in a lattice. After an overview of the previous applications of unsupervised learning and SOM in the field of Document Image Analysis we describe our r...
Publication date: 2006